Enabling scalable and accurate clustering of distributed ligand geometries on supercomputers
Similar resources
Enabling scalable and accurate clustering of distributed ligand geometries on supercomputers
We present an efficient and accurate clustering method for the analysis of protein-ligand docking datasets on large distributed-memory systems. For each ligand conformation in the dataset, our clustering algorithm first extracts relevant geometrical properties and transforms the properties into a single metadata point in the N-dimensional (N-D) space. Then, it performs an N-D clustering on the ...
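The pipeline this abstract outlines (reduce each conformation to one point, then cluster the points) can be illustrated with a minimal sketch. The abstract is truncated and does not say which geometrical properties are extracted or which N-D clustering is used, so everything below is an assumption made for illustration: `centroid` uses the mean atom position as a 3-D metadata point, and `grid_cluster` groups points by a uniform grid cell as a stand-in for the clustering step. Neither name comes from the paper.

```python
from collections import defaultdict


def centroid(atoms):
    """Reduce one conformation, given as a list of (x, y, z) atom
    coordinates, to a single 3-D point (assumed property: mean position)."""
    n = len(atoms)
    return tuple(sum(atom[i] for atom in atoms) / n for i in range(3))


def grid_cluster(points, cell=1.0):
    """Group point indices by the grid cell each point falls in.

    This is a simple stand-in for an N-D clustering step, not the
    method used in the paper.
    """
    cells = defaultdict(list)
    for idx, point in enumerate(points):
        key = tuple(int(coord // cell) for coord in point)
        cells[key].append(idx)
    return dict(cells)


# Hypothetical toy data: three conformations of two atoms each.
conformations = [
    [(0.0, 0.0, 0.0), (0.2, 0.2, 0.2)],
    [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3)],
    [(5.0, 5.0, 5.0), (5.2, 5.2, 5.2)],
]
clusters = grid_cluster([centroid(c) for c in conformations])
```

Because each conformation becomes one small point, only the points need to be exchanged between nodes, which is presumably what makes this style of reduction attractive on distributed-memory systems.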
Scalable and Accurate Algorithm for Graph Clustering
One of the most useful measures of quality for graph clustering is the modularity of the partition, which measures the difference between the number of the edges with endpoints in the same cluster and the expected number of such edges in a random graph. In this paper we show that the problem of finding a partition maximizing the modularity of a given graph G can be reduced to a minimum weighted...
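The modularity measure mentioned in this abstract can be made concrete with a short sketch. It uses the standard Newman-Girvan form for a simple undirected, unweighted graph, Q = sum over clusters c of (L_c / m - (d_c / 2m)^2), where m is the number of edges, L_c the number of edges inside c, and d_c the total degree of the nodes in c. The function name and input layout are assumptions for illustration, and the sketch only evaluates a given partition; it does not implement the reduction the paper proposes.

```python
def modularity(edges, community):
    """Modularity of a partition of a simple undirected graph.

    edges:     list of (u, v) pairs, one per edge, no self-loops
    community: dict mapping each node to its cluster label
    """
    m = len(edges)
    degree = {}
    intra = {}  # number of edges with both endpoints in the same cluster
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            label = community[u]
            intra[label] = intra.get(label, 0) + 1

    degree_sum = {}  # total degree per cluster
    for node, d in degree.items():
        label = community[node]
        degree_sum[label] = degree_sum.get(label, 0) + d

    return sum(
        intra.get(label, 0) / m - (degree_sum[label] / (2 * m)) ** 2
        for label in degree_sum
    )


# Hypothetical example: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "a" for node in range(6)}
q_two = modularity(edges, two_clusters)
q_one = modularity(edges, one_cluster)
```

Splitting the graph into its two triangles scores higher than putting every node in one cluster, which matches the intuition that modularity rewards partitions with more internal edges than a random graph would be expected to have.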
Enabling scalable spectral clustering for image segmentation
Spectral clustering has become an increasingly adopted tool and an active area of research in the machine learning community over the last decade. A common challenge with image segmentation methods based on spectral clustering is scalability, since the computation can become intractable for large images. Down-sizing the image, however, will cause a loss of finer details and can lead to less acc...
Scalable Density-Based Distributed Clustering
Clustering has become an increasingly important task in analysing huge amounts of data. Traditional applications require that all data has to be located at the site where it is scrutinized. Nowadays, large amounts of heterogeneous, complex data reside on different, independently working computers which are connected to each other via local or wide area networks. In this paper, we propose a scal...
Scalable and Distributed Clustering via Lightweight Coresets
Coresets are compact representations of data sets such that models trained on a coreset are provably competitive with models trained on the full data set. As such, they have been successfully used to scale up clustering models to massive data sets. While existing approaches generally only allow for multiplicative approximation errors, we propose a novel notion of coresets called lightweight cor...
Journal
Journal title: Parallel Computing
Year: 2017
ISSN: 0167-8191
DOI: 10.1016/j.parco.2017.02.005